Yolo County
- North America > United States > California > Yolo County > Davis (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Health & Medicine (0.94)
- Information Technology (0.67)
- North America > United States > California > Yolo County > Davis (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > China > Jiangsu Province (0.04)
- North America > United States > California > Yolo County > Davis (0.14)
- North America > Canada > Quebec > Montreal (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Asia > China (0.04)
- North America > United States > Virginia (0.04)
- North America > United States > California > Yolo County > Davis (0.04)
- Asia > Nepal (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine (0.67)
- Social Sector (0.46)
- North America > United States > California > Yolo County > Davis (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Research Report (0.67)
- Workflow (0.46)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > United States > California > Yolo County > Davis (0.14)
- North America > United States > Massachusetts > Middlesex County > Waltham (0.04)
RA-PbRL: Provably Efficient Risk-Aware Preference-Based Reinforcement Learning
Reinforcement Learning from Human Feedback (RLHF) has recently surged in popularity, particularly for aligning large language models and other AI systems with human intentions. At its core, RLHF can be viewed as a specialized instance of Preference-based Reinforcement Learning (PbRL), where the preferences specifically originate from human judgments rather than arbitrary evaluators. Despite this connection, most existing approaches in both RLHF and PbRL primarily focus on optimizing a mean reward objective, neglecting scenarios that necessitate risk-awareness, such as AI safety, healthcare, and autonomous driving. These scenarios often operate under a one-episode-reward setting, which makes conventional risk-sensitive objectives inapplicable.
- North America > United States > Oregon (0.04)
- North America > United States > California > Yolo County > Davis (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.92)
- Workflow (0.68)
- Transportation > Ground > Road (0.34)
- Information Technology > Robotics & Automation (0.34)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Asia > China (0.04)
- South America > Peru > Lima Department > Lima Province > Lima (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
- Information Technology > Artificial Intelligence > Natural Language (0.93)
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > California > Yolo County > Davis (0.04)
- Asia > Middle East > Israel (0.04)
- North America > United States > California > Yolo County > Davis (0.14)
- North America > United States > Rhode Island > Providence County > Providence (0.04)
- Asia > Afghanistan > Parwan Province > Charikar (0.04)
- Information Technology > Artificial Intelligence > Natural Language (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
- Information Technology > Data Science > Data Mining (0.67)